feat: integrate pyannote-cloud into listener2 plugin #4161
Open
devin-ai-integration[bot] wants to merge 1 commit into main
- Add hypr-pyannote-cloud dependency to listener2
- Add BatchProvider::Pyannote variant
- Implement run_batch_pyannote with media upload, diarization job submission, polling, and response mapping
- Add Pyannote error variant to error.rs
- Map pyannote TranscriptionSegments to owhisper batch::Response format with speaker indices

Co-Authored-By: yujonglee <yujonglee.dev@gmail.com>
feat: integrate pyannote-cloud into listener2 plugin
Summary
Adds pyannote-cloud as a new batch transcription provider in the listener2 plugin. Unlike existing providers that use WebSocket streaming via owhisper-client adapters, pyannote uses a REST-based async job model:
- Upload the audio via media:// presigned URLs
- Submit a diarization job (POST /v1/diarize with transcription: true)
- Poll GET /v1/jobs/{jobId} until completion (2s interval, 10min timeout)
- Map TranscriptionSegment results into the existing owhisper_interface::batch::Response format

Changes:
- plugins/listener2/Cargo.toml: added hypr-pyannote-cloud, reqwest, serde_json, uuid deps
- plugins/listener2/src/error.rs: added Pyannote(String) error variant
- plugins/listener2/src/ext.rs: added BatchProvider::Pyannote variant + ~250 lines of implementation (make_pyannote_client, pyannote_upload_audio, pyannote_poll_job, pyannote_diarization_to_batch_response, run_batch_pyannote)

Review & Testing Checklist for Human
- pyannote_diarization_to_batch_response prefers word_level_transcription over turn_level_transcription. Verify that the word-level segments from pyannote actually have per-word granularity (not full sentences), and that the space-joined transcript makes sense for both levels.
- Confidence is hardcoded to 1.0: pyannote doesn't provide per-word confidence. Verify that downstream consumers handle this synthetic value correctly and don't misinterpret it.
- listen_params/model are not forwarded: run_batch_pyannote ignores languages, keywords, and model from BatchParams. DiarizeRequest.model is set to None (defaults to precision-2). Decide whether params.model should be forwarded.
- The entire audio file is read into memory (tokio::fs::read): this could be an issue for very large recordings. Other providers use chunked streaming.

Notes
- GetJobByIdResponse is #[serde(untagged)] with DiarizationJob as the first variant, which means it will greedily match diarization responses. This works for the diarize use case, but the enum ordering matters.
- API requests go through a reqwest::Client that includes Authorization: Bearer {api_key} headers. The presigned URL upload uses a separate unauthenticated client (correct behavior).
- Poll timeout is 10 minutes (PYANNOTE_POLL_TIMEOUT), poll interval is 2 seconds (PYANNOTE_POLL_INTERVAL).

Link to Devin run: https://app.devin.ai/sessions/855b2a5608234efaab84b8aaec1a9550
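
For reviewers, the poll-until-completion behavior (2-second interval, 10-minute timeout) can be pictured with a minimal sketch like the one below. It is not the code in this PR: JobStatus, PollError, and poll_until_done are hypothetical names, the job lookup is abstracted as a closure instead of a real GET /v1/jobs/{jobId} call, and it blocks with std::thread::sleep to stay self-contained, whereas the plugin presumably uses an async sleep.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical job states; the real pyannote response types differ.
#[derive(Debug, PartialEq)]
enum JobStatus {
    Pending,
    Succeeded(String),
    Failed(String),
}

/// Hypothetical error type for the sketch.
#[derive(Debug, PartialEq)]
enum PollError {
    Failed(String),
    TimedOut,
}

/// Calls `fetch` repeatedly until the job reaches a terminal state
/// or `timeout` has elapsed, sleeping `interval` between attempts.
fn poll_until_done<F>(
    mut fetch: F,
    interval: Duration,
    timeout: Duration,
) -> Result<String, PollError>
where
    F: FnMut() -> JobStatus,
{
    let deadline = Instant::now() + timeout;
    loop {
        match fetch() {
            JobStatus::Succeeded(output) => return Ok(output),
            JobStatus::Failed(message) => return Err(PollError::Failed(message)),
            JobStatus::Pending => {}
        }
        // Check the deadline after each attempt so the job is fetched at least once.
        if Instant::now() >= deadline {
            return Err(PollError::TimedOut);
        }
        sleep(interval);
    }
}

fn main() {
    // Simulated job that is pending twice, then succeeds.
    let mut calls = 0;
    let result = poll_until_done(
        || {
            calls += 1;
            if calls < 3 {
                JobStatus::Pending
            } else {
                JobStatus::Succeeded("done".to_string())
            }
        },
        Duration::from_millis(1), // the PR describes a 2 s interval
        Duration::from_secs(1),   // the PR describes a 10 min timeout
    );
    println!("{:?} after {} calls", result, calls);
}
```

Checking the deadline after the fetch rather than before means a job that is already finished is returned even with a zero timeout; whether pyannote_poll_job behaves the same way is worth confirming during review.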
Requested by: @yujonglee
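
To make the response-mapping checklist items concrete, here is a rough sketch of how segments might be turned into words with speaker indices, a space-joined transcript, and the synthetic 1.0 confidence. Segment, Word, and map_segments are hypothetical stand-ins, not the hypr-pyannote-cloud or owhisper_interface types, and the fallback to turn-level segments when no word-level segments exist is an assumption about what "prefers" means here.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a pyannote transcription segment.
struct Segment {
    speaker: String,
    start: f64,
    end: f64,
    text: String,
}

/// Hypothetical stand-in for a word entry in the batch response.
#[derive(Debug, PartialEq)]
struct Word {
    word: String,
    start: f64,
    end: f64,
    confidence: f64,
    speaker: usize,
}

/// Prefers word-level segments, falling back to turn-level ones (assumed),
/// and assigns speaker indices in order of first appearance.
fn map_segments(word_level: &[Segment], turn_level: &[Segment]) -> (String, Vec<Word>) {
    let source = if word_level.is_empty() { turn_level } else { word_level };
    let mut speaker_ids: HashMap<&str, usize> = HashMap::new();
    let mut words = Vec::with_capacity(source.len());
    for seg in source {
        let next_id = speaker_ids.len();
        let speaker = *speaker_ids.entry(seg.speaker.as_str()).or_insert(next_id);
        words.push(Word {
            word: seg.text.trim().to_string(),
            start: seg.start,
            end: seg.end,
            confidence: 1.0, // synthetic: no per-word confidence is available
            speaker,
        });
    }
    // Joining with spaces only reads well if each segment is a single word
    // or a whole turn, which is the point raised in the checklist.
    let transcript = words
        .iter()
        .map(|w| w.word.as_str())
        .collect::<Vec<_>>()
        .join(" ");
    (transcript, words)
}

fn main() {
    // Illustrative data only; speaker labels are made up for the example.
    let word_level = vec![
        Segment { speaker: "SPEAKER_00".into(), start: 0.0, end: 0.4, text: "hello".into() },
        Segment { speaker: "SPEAKER_01".into(), start: 0.5, end: 0.9, text: "hi".into() },
        Segment { speaker: "SPEAKER_00".into(), start: 1.0, end: 1.3, text: "there".into() },
    ];
    let (transcript, words) = map_segments(&word_level, &[]);
    println!("{transcript}");
    for w in &words {
        println!("{:?}", w);
    }
}
```

If the turn-level fallback is taken, each Word holds a whole sentence, so consumers that treat entries as single words would see coarse timings; that is one reason to verify the granularity of the word-level output.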